The Landscape of Advanced Generative AI
PolyU COMP5511 Lesson 11

Advanced Generative AI has evolved from isolated, monolithic models into a multi-layered ecosystem defined by Compound AI Systems. The emphasis shifts from a single model performing probabilistic token prediction to systems that orchestrate foundation models (FMs), modular plugins, and cross-modal synthesis.

[Figure: the generative stack. Compute / Cloud Infrastructure; LLMs, Diffusion, Audio / Code; Orchestration & Agentic Layer]

The Generative Stack Taxonomy

  • Infrastructure Layer: The hardware backbone (GPUs/TPUs) and cloud services that provide the massive compute required for training and high-speed inference.
  • Model Layer: The Foundation Models (FMs) such as GPT-4, Llama 3, and Stable Diffusion that serve as the specialized engines for different modalities.
  • Orchestration Layer: Frameworks that manage logic, data flow, and retrieval, transitioning models from "frozen" weights to systems with Real-time Contextual Awareness.
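The layering above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a real framework: `text_model`, `image_model`, and `Orchestrator` are hypothetical names, the "models" are stubs that echo their input, and the `context` dictionary stands in for whatever live data source an orchestration layer would query.

```python
from dataclasses import dataclass
from typing import Callable, Dict


# Model layer: each "foundation model" is a hypothetical stub callable.
def text_model(prompt: str) -> str:
    return f"[text] {prompt}"


def image_model(prompt: str) -> str:
    return f"[image] {prompt}"


@dataclass
class Orchestrator:
    """Orchestration layer: picks a model and injects external context."""

    models: Dict[str, Callable[[str], str]]
    context: Dict[str, str]  # assumed stand-in for real-time external data

    def run(self, modality: str, prompt: str) -> str:
        if modality not in self.models:
            raise ValueError(f"no model registered for modality {modality!r}")
        # Prepend every context entry whose key appears in the prompt.
        facts = [v for k, v in self.context.items() if k in prompt.lower()]
        enriched = " | ".join(facts + [prompt])
        return self.models[modality](enriched)


orchestrator = Orchestrator(
    models={"text": text_model, "image": image_model},
    context={"weather": "context: it is raining today"},
)
result = orchestrator.run("text", "Summarise the weather")
print(result)
```

The model stubs never change, yet the output depends on what the orchestrator finds in `context`, which is the sense in which frozen weights gain contextual awareness.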

Modality Convergence

The technical trend focuses on unifying architectures (primarily Transformers and Diffusion models), allowing for a shared latent space. This enables a single unified interface where text, image, and video are manipulated as a continuous stream of information, represented mathematically as a mapping between disparate latent manifolds $M_{text} \leftrightarrow M_{visual}$.
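One way to picture a shared latent space is to project each modality's features into a common vector space and compare them there. The toy sketch below assumes hand-picked projection matrices (`W_TEXT`, `W_IMAGE`) and made-up feature vectors; in a real system both would be learned, and none of these names come from the lecture.

```python
import math


def project(matrix, vector):
    """Apply a linear map (list of rows) to a feature vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def cosine(a, b):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


# Hypothetical projections into a 2-d shared space (hand-picked, not learned).
W_TEXT = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]             # 3-d text features
W_IMAGE = [[0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 0.0, 1.0]]  # 4-d image features

# Made-up modality-specific features, for illustration only.
text_features = {"a dog": [1.0, 0.1, 0.5], "a car": [0.1, 1.0, 0.3]}
image_features = {"dog.jpg": [0.2, 0.9, 0.7, 0.1], "car.jpg": [0.4, 0.1, 0.2, 0.8]}


def best_image(caption):
    """Return the image whose shared-space vector is closest to the caption's."""
    z_text = project(W_TEXT, text_features[caption])
    return max(
        image_features,
        key=lambda name: cosine(z_text, project(W_IMAGE, image_features[name])),
    )


print(best_image("a dog"), best_image("a car"))
```

Once both modalities live in the same space, cross-modal retrieval reduces to a nearest-neighbour search, which is the practical payoff of the mapping $M_{text} \leftrightarrow M_{visual}$.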

Structural Evolution
We are moving from "Closed-Book" models, which rely solely on the parameters $\theta$ learned from training data, to "Open-Book" systems, which also condition on external environment state $E$ and solve complex reasoning tasks via $P(y|x, E)$.
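The contrast between the two regimes can be made concrete with a toy lookup. This is a sketch under simplifying assumptions: `THETA` is a hypothetical dictionary standing in for frozen parameters, `E` is a hypothetical environment state, and each function returns a single answer rather than a probability distribution over $y$.

```python
from typing import Dict, Optional

# Stand-in for frozen parameters theta: knowledge fixed at training time.
THETA: Dict[str, str] = {"capital of france": "Paris"}


def closed_book(x: str) -> Optional[str]:
    """Answer from frozen parameters only (a crude stand-in for P(y | x))."""
    return THETA.get(x.lower())


def open_book(x: str, env: Dict[str, str]) -> Optional[str]:
    """Consult environment state E first, then fall back to the parameters
    (a crude stand-in for P(y | x, E))."""
    key = x.lower()
    if key in env:
        return env[key]
    return closed_book(x)


# Hypothetical environment state, unknown at training time.
E = {"latest build status": "passing"}

print(closed_book("latest build status"))   # the frozen model has no answer
print(open_book("latest build status", E))  # the environment supplies one
print(open_book("Capital of France", E))    # parametric knowledge still works
```

The open-book function keeps the closed-book path as a fallback, so adding $E$ extends what the system can answer without retraining $\theta$.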
Python Implementation